Exercise solutions: Section 5.11

fpp3 5.11, Ex 1

Produce forecasts for the following series using whichever of NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case:

- Australian Population (global_economy)
- Bricks (aus_production)
- NSW Lambs (aus_livestock)
- Household wealth (hh_budget)
- Australian takeaway food turnover (aus_retail)

Australian population

global_economy |>
filter(Country == "Australia") |>
autoplot(Population)

The data have an upward trend and no seasonality, so a random walk with drift is appropriate.
global_economy |>
filter(Country == "Australia") |>
model(RW(Population ~ drift())) |>
forecast(h = "10 years") |>
autoplot(global_economy)

Australian clay brick production
aus_production |>
filter(!is.na(Bricks)) |>
autoplot(Bricks) +
labs(title = "Clay brick production")

This data appears to have more seasonality than trend, so of the models available, seasonal naive is most appropriate.
aus_production |>
filter(!is.na(Bricks)) |>
model(SNAIVE(Bricks)) |>
forecast(h = "5 years") |>
autoplot(aus_production)

NSW Lambs
nsw_lambs <- aus_livestock |>
filter(State == "New South Wales", Animal == "Lambs")
nsw_lambs |>
autoplot(Count)

This data appears to have more seasonality than trend, so of the models available, seasonal naive is most appropriate.
nsw_lambs |>
model(SNAIVE(Count)) |>
forecast(h = "5 years") |>
autoplot(nsw_lambs)

Household wealth
hh_budget |>
autoplot(Wealth)

This is annual data with an upward trend, so a random walk with drift is appropriate.
hh_budget |>
model(RW(Wealth ~ drift())) |>
forecast(h = "5 years") |>
autoplot(hh_budget)

Australian takeaway food turnover
takeaway <- aus_retail |>
filter(Industry == "Takeaway food services") |>
summarise(Turnover = sum(Turnover))
takeaway |> autoplot(Turnover)

This data has strong seasonality and a strong trend, so we will use a seasonal naive model with drift.
takeaway |>
model(SNAIVE(Turnover ~ drift())) |>
forecast(h = "5 years") |>
autoplot(takeaway)

This is actually not one of the four benchmark methods discussed in the book, but it is sometimes a useful benchmark when there is strong seasonality and a strong trend.
The corresponding equation is \[ \hat{y}_{T+h|T} = y_{T+h-m(k+1)} + \frac{h}{T-m}\sum_{t=m+1}^T(y_t - y_{t-m}), \] where \(m=12\) and \(k\) is the integer part of \((h-1)/m\) (i.e., the number of complete years in the forecast period prior to time \(T+h\)).
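As a check on the formula, the point forecasts can be computed by hand. This is a minimal sketch using a generic built-in monthly series (USAccDeaths) rather than the takeaway data, purely to illustrate the arithmetic:

```r
# Hand-computed seasonal-naive-with-drift forecasts, following the formula above.
# USAccDeaths is a built-in monthly series, used here only for illustration.
y <- as.numeric(USAccDeaths)
T <- length(y)
m <- 12                                   # seasonal period
h <- 1:24                                 # forecast horizons
k <- (h - 1) %/% m                        # complete years before time T + h
drift <- sum(diff(y, lag = m)) / (T - m)  # (1/(T-m)) * sum of (y_t - y_{t-m})
fc <- y[T + h - m * (k + 1)] + h * drift  # y_{T+h-m(k+1)} + h * drift
```

Each forecast repeats the most recent observation from the same month and adds h steps of the average seasonal difference, so forecasts one year apart differ by exactly m * drift.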
fpp3 5.11, Ex 3
Apply a seasonal naïve method to the quarterly Australian beer production data from 1992. Check if the residuals look like white noise, and plot the forecasts. The following code will help.
# Extract data of interest
recent_production <- aus_production |>
filter(year(Quarter) >= 1992)
# Define and estimate a model
fit <- recent_production |> model(SNAIVE(Beer))
# Look at the residuals
fit |> gg_tsresiduals()

- The residuals are not centred around 0 (typically sitting slightly below it); this is because the model fails to capture the negative trend in the data.
- Peaks and troughs in the residuals spaced roughly 4 observations apart are apparent, leading to a negative spike at lag 4 in the ACF. So they do not resemble white noise. Lags 1 and 3 are also significant; however, they are very close to the threshold and of little concern.
- The distribution of the residuals does not appear very normal, but it is probably close enough for the accuracy of our intervals (the fact that it is not centred on 0 is more concerning).
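The visual impression from the ACF can be backed up with a portmanteau test. A minimal sketch using the ljung_box feature on the innovation residuals of the fit above (lag = 8, a common choice for quarterly data); a small p-value would confirm the residuals are distinguishable from white noise:

```r
# Ljung-Box test on the innovation residuals of the SNAIVE fit.
augment(fit) |>
  features(.innov, ljung_box, lag = 8)
```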
# Look at some forecasts
fit |>
forecast() |>
autoplot(recent_production)

The forecasts look reasonable, although the intervals may be a bit wide. This is likely due to the slight trend not captured by the model (which subsequently violates the assumptions imposed on the residuals).
fpp3 5.11, Ex 5
Produce forecasts for the 7 Victorian series in aus_livestock using SNAIVE(). Plot the resulting forecasts including the historical data. Is this a reasonable benchmark for these series?
aus_livestock |>
filter(State == "Victoria") |>
model(SNAIVE(Count)) |>
forecast(h = "5 years") |>
autoplot(aus_livestock)

- Most point forecasts look reasonable from the seasonal naive method.
- Some series are more seasonal than others, and for the series with very weak seasonality it may be better to consider using a naive or drift method.
- The prediction intervals in some cases go below zero, so perhaps a log transformation would have been better for these series.
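One way to keep the intervals above zero is to model the log of the counts; fable back-transforms automatically when forecasting. A minimal sketch (the + 1 offset is a hypothetical guard in case any series contains zero counts):

```r
aus_livestock |>
  filter(State == "Victoria") |>
  model(SNAIVE(log(Count + 1))) |>  # log transformation keeps back-transformed intervals positive
  forecast(h = "5 years") |>
  autoplot(aus_livestock)
```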
fpp3 5.11, Ex 11
We will use the bricks data from aus_production (Australian quarterly clay brick production 1956–2005) for this exercise.
- Use an STL decomposition to calculate the trend-cycle and seasonal indices. (Experiment with having fixed or changing seasonality.)
tidy_bricks <- aus_production |>
filter(!is.na(Bricks))
tidy_bricks |>
model(STL(Bricks)) |>
components() |>
autoplot()

Data is multiplicative, and so a transformation should be used.
dcmp <- tidy_bricks |>
model(STL(log(Bricks))) |>
components()
dcmp |>
autoplot()

Seasonality varies slightly.
dcmp <- tidy_bricks |>
model(stl = STL(log(Bricks) ~ season(window = "periodic"))) |>
components()
dcmp |> autoplot()

The seasonality looks fairly stable, so I’ve used a periodic season (window). The decomposition still performs well when the seasonal component is fixed. The remainder term does not appear to contain a substantial amount of seasonality.
- Compute and plot the seasonally adjusted data.
dcmp |>
as_tsibble() |>
autoplot(season_adjust)
- Use a naïve method to produce forecasts of the seasonally adjusted data.
fit <- dcmp |>
select(-.model) |>
model(naive = NAIVE(season_adjust)) |>
forecast(h = "5 years")
dcmp |>
as_tsibble() |>
autoplot(season_adjust) + autolayer(fit)
- Use decomposition_model() to reseasonalise the results, giving forecasts for the original data.
fit <- tidy_bricks |>
model(stl_mdl = decomposition_model(STL(log(Bricks)), NAIVE(season_adjust)))
fit |>
forecast(h = "5 years") |>
autoplot(tidy_bricks)
- Do the residuals look uncorrelated?
fit |> gg_tsresiduals()

The residuals do not appear uncorrelated, as there are several lags of the ACF which exceed the significance threshold.
- Repeat with a robust STL decomposition. Does it make much difference?
fit_robust <- tidy_bricks |>
model(stl_mdl = decomposition_model(STL(log(Bricks), robust = TRUE), NAIVE(season_adjust)))
fit_robust |> gg_tsresiduals()

The residuals appear slightly less auto-correlated; however, there is still significant auto-correlation at lag 8.
- Compare forecasts from decomposition_model() with those from SNAIVE(), using a test set comprising the last 2 years of data. Which is better?
tidy_bricks_train <- tidy_bricks |>
slice(1:(n() - 8))
fit <- tidy_bricks_train |>
model(
stl_mdl = decomposition_model(STL(log(Bricks)), NAIVE(season_adjust)),
snaive = SNAIVE(Bricks)
)
fc <- fit |>
forecast(h = "2 years")
fc |>
autoplot(tidy_bricks, level = NULL)

The decomposition forecasts appear to more closely follow the actual future data.
fc |>
accuracy(tidy_bricks)

# A tibble: 2 × 10
.model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 snaive Test 2.75 20 18.2 0.395 4.52 0.504 0.407 -0.0503
2 stl_mdl Test 0.368 18.1 15.1 -0.0679 3.76 0.418 0.368 0.115
The STL decomposition forecasts are more accurate than the seasonal naive forecasts across all accuracy measures.
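As a sanity check, the test-set RMSE values in the table can be reproduced directly from the point forecasts and the held-out observations. A sketch assuming the fc and tidy_bricks objects from above:

```r
# Recompute test-set RMSE by hand from the point forecasts (.mean)
# and the actual observations held out of the training set.
fc |>
  as_tibble() |>
  select(.model, Quarter, .mean) |>
  left_join(as_tibble(tidy_bricks) |> select(Quarter, Bricks), by = "Quarter") |>
  group_by(.model) |>
  summarise(RMSE = sqrt(mean((Bricks - .mean)^2)))
```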